Mining Strongly Correlated Intervals with Hypergraphs

Authors

  • Hao Wang
  • Dejing Dou
  • Yue Fang
  • Yongli Zhang
Abstract

Correlation is an important statistical measure for estimating dependencies between numerical attributes in multivariate datasets. Previous correlation discovery algorithms are mostly dedicated to finding piecewise correlations between the attributes. Other research efforts, such as correlation-preserving discretization, can find strongly correlated intervals through a discretization process while preserving correlation. However, discretization-based methods suffer from some fundamental problems, such as information loss and crisp boundaries. In this paper, we propose a novel method to discover strongly correlated intervals from numerical datasets without using discretization. We propose a hypergraph model to capture the underlying correlation structure in multivariate numerical data and a corresponding algorithm to discover strongly correlated intervals from the hypergraph model. Strongly correlated intervals can be found even when the corresponding attributes are weakly correlated or not correlated at all. Experimental results from a health social network dataset show the effectiveness of our algorithm.
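
The claim that an interval can be strongly correlated while the attributes as a whole are not can be illustrated with a minimal sketch. This is not the authors' hypergraph algorithm, which the abstract does not specify; the toy "tent-shaped" data, the helper names `pearson` and `correlation_in_interval`, and the choice of interval bounds are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2:
        return 0.0  # too few points to define a correlation; treated as 0 here
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant sequence; treated as 0 here
    return cov / sqrt(var_x * var_y)


def correlation_in_interval(xs, ys, lo, hi):
    """Pearson correlation restricted to records with lo <= x <= hi."""
    pairs = [(x, y) for x, y in zip(xs, ys) if lo <= x <= hi]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Hypothetical toy data: y rises with x on [0, 9] and falls on [10, 19].
xs = list(range(20))
ys = [2 * x if x < 10 else 2 * (19 - x) for x in xs]

r_all = pearson(xs, ys)                             # whole attribute range
r_left = correlation_in_interval(xs, ys, 0, 9)      # rising interval
r_right = correlation_in_interval(xs, ys, 10, 19)   # falling interval

print(f"all: {r_all:.3f}, [0, 9]: {r_left:.3f}, [10, 19]: {r_right:.3f}")
```

On this data the two attributes are uncorrelated over the full range because the rising and falling halves cancel, yet each half on its own is perfectly correlated. A fixed discretization that happened to place a bin boundary away from x = 10 would blur exactly this structure, which is the kind of boundary problem the abstract attributes to discretization-based methods.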

Similar resources

Mining top-k strongly correlated item pairs without minimum correlation threshold

Given a user-specified minimum correlation threshold and a transaction database, the problem of mining strongly correlated item pairs is to find all item pairs with Pearson's correlation coefficients above the threshold. However, setting such a threshold is by no means an easy task. In this paper, we consider a more practical problem: mining top-k strongly correlated item pairs, where k is the ...

Hypergraphs and fast mining of association rules

The activities I’ve done in this past year can be summarized as follows. Nov. 2002-Jun. 2003: I started my doctorate, working in the field of Information Retrieval. Jul. 2003: I attended a two-week summer school at Lipari focused on “Data Mining and Pattern Matching”. May 2003-now: I started to study directed hypergraphs, and in particular I focused on algorithms for optimal hyperpaths. I’ll c...

Pattern Mining for General Intelligence: The FISHGRAM Algorithm for Frequent and Interesting Subhypergraph Mining

Fishgram, a novel algorithm for recognizing frequent or otherwise interesting sub-hypergraphs in large, heterogeneous hypergraphs, is presented. The algorithm’s implementation in the OpenCog integrative AGI framework is described, and concrete examples are given showing the patterns it recognizes in OpenCog’s hypergraph knowledge store when the OpenCog system is used to control a virtual agent in ...

Cellular resolutions of cointerval ideals

Minimal cellular resolutions of the edge ideals of cointerval hypergraphs are constructed. This class of d–uniform hypergraphs coincides with the complements of interval graphs (for the case d = 2), and strictly contains the class of ‘strongly stable’ hypergraphs corresponding to pure shifted simplicial complexes. The polyhedral complexes supporting the resolutions are described as certain spac...

On Point Covers of Multiple Intervals and Axis-Parallel Rectangles

In certain families of hypergraphs, the transversal number is bounded by some function of the packing number. In this paper we study hypergraphs related to multiple intervals and to axis-parallel rectangles, respectively. Essential improvements on previously established upper bounds are presented here. We explore the close connection between the two problems at issue.

Publication date: 2015